AITopics | task-specific layer

Collaborating Authors

task-specific layer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Sheaf-Based Decentralized Multimodal Learning for Next-Generation Wireless Communication Systems

Ghalkha, Abdulmomen, Tian, Zhuojun, Issaid, Chaouki Ben, Bennis, Mehdi

arXiv.org Artificial IntelligenceJun-30-2025

--In large-scale communication systems, increasingly complex scenarios require more intelligent collaboration among edge devices collecting various multimodal sensory data to achieve a more comprehensive understanding of the environment and improve decision-making accuracy. However, conventional federated learning (FL) algorithms typically consider unimodal datasets, require identical model architectures, and fail to leverage the rich information embedded in multimodal data, limiting their applicability to real-world scenarios with diverse modalities and varying client capabilities. T o address this issue, we propose Sheaf-DMFL, a novel decentralized multimodal learning framework leveraging sheaf theory to enhance collaboration among devices with diverse modalities. Specifically, each client has a set of local feature encoders for its different modalities, whose outputs are concatenated before passing through a task-specific layer . While encoders for the same modality are trained collaboratively across clients, we capture the intrinsic correlations among clients' task-specific layers using a sheaf-based structure. T o further enhance learning capability, we propose an enhanced algorithm named Sheaf-DMFL-Att, which tailors the attention mechanism within each client to capture correlations among different modalities. A rigorous convergence analysis of Sheaf-DMFL-Att is provided, establishing its theoretical guarantees. Extensive simulations are conducted on real-world link blockage prediction and mmWave beamforming scenarios, demonstrate the superiority of the proposed algorithms in such heterogeneous wireless communication systems. Index T erms--Distributed optimization, decentralized learning, multimodal federated learning, next-generation wireless communication.

artificial intelligence, machine learning, modality, (16 more...)

arXiv.org Artificial Intelligence

2506.22374

Country: Europe (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Black-Box Privacy Attacks on Shared Representations in Multitask Learning

Abascal, John, Berrios, Nicolás, Oprea, Alina, Ullman, Jonathan, Smith, Adam, Jagielski, Matthew

arXiv.org Artificial IntelligenceJun-23-2025

Multitask learning (MTL) has emerged as a powerful paradigm that leverages similarities among multiple learning tasks, each with insufficient samples to train a standalone model, to solve them simultaneously while minimizing data sharing across users and organizations. MTL typically accomplishes this goal by learning a shared representation that captures common structure among the tasks by embedding data from all tasks into a common feature space. Despite being designed to be the smallest unit of shared information necessary to effectively learn patterns across multiple tasks, these shared representations can inadvertently leak sensitive information about the particular tasks they were trained on. In this work, we investigate what information is revealed by the shared representations through the lens of inference attacks. Towards this, we propose a novel, black-box task-inference threat model where the adversary, given the embedding vectors produced by querying the shared representation on samples from a particular task, aims to determine whether that task was present when training the shared representation. We develop efficient, purely black-box attacks on machine learning models that exploit the dependencies between embeddings from the same task without requiring shadow models or labeled reference data. We evaluate our attacks across vision and language domains for multiple use cases of MTL and demonstrate that even with access only to fresh task samples rather than training data, a black-box adversary can successfully infer a task's inclusion in training. To complement our experiments, we provide theoretical analysis of a simplified learning setting and show a strict separation between adversaries with training samples and fresh samples from the target task's distribution.

adversary, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.1646

Country: North America > United States > California (0.93)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Air (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Unified Attention Modeling for Efficient Free-Viewing and Visual Search via Shared Representations

Mohammed, Fatma Youssef, Alexis, Kostas

arXiv.org Artificial IntelligenceJun-4-2025

Computational human attention modeling in free-viewing and task-specific settings is often studied separately, with limited exploration of whether a common representation exists between them. This work investigates this question and proposes a neural network architecture that builds upon the Human Attention transformer (HAT) to test the hypothesis. Our results demonstrate that free-viewing and visual search can efficiently share a common representation, allowing a model trained in free-viewing attention to transfer its knowledge to task-driven visual search with a performance drop of only 3.86% in the predicted fixation scanpaths, measured by the semantic sequence score (SemSS) metric which reflects the similarity between predicted and human scanpaths. This transfer reduces computational costs by 92.29% in terms of GFLOPs and 31.23% in terms of trainable parameters.

artificial intelligence, information management, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.02764

Country: Europe (0.28)

Genre: Research Report > New Finding (0.54)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Information Management > Search (0.89)

Add feedback

Tint Your Models Task-wise for Improved Multi-task Model Merging

Jung, Aecheon, Lee, Seunghwan, Han, Dongyoon, Hong, Sungeun

arXiv.org Artificial IntelligenceDec-26-2024

Traditional model merging methods for multi-task learning (MTL) address task conflicts with straightforward strategies such as weight averaging, sign consensus, or minimal test-time adjustments. This presumably counts on the assumption that a merged encoder still retains abundant task knowledge from individual encoders, implying that its shared representation is sufficiently general across tasks. However, our insight is that adding just a single trainable task-specific layer further can bring striking performance gains, as demonstrated by our pilot study. Motivated by this finding, we propose Model Tinting, a new test-time approach that introduces a single task-specific layer for each task as trainable adjustments. Our method jointly trains merging coefficients and task-specific layers, which effectively reduces task conflicts with minimal additional costs. Additionally, we propose a sampling method that utilizes the difference in confidence levels of both merged and individual encoders. Extensive experiments demonstrate our method's effectiveness, which achieves state-of-the-art performance across both computer vision and natural language processing tasks and significantly surpasses prior works. Our code is available at https://github.com/AIM-SKKU/ModelTinting.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.19098

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Multi-task Representation Learning for Mixed Integer Linear Programming

Cai, Junyang, Huang, Taoan, Dilkina, Bistra

arXiv.org Artificial IntelligenceDec-18-2024

Mixed Integer Linear Programs (MILPs) are highly flexible and powerful tools for modeling and solving complex real-world combinatorial optimization problems. Recently, machine learning (ML)-guided approaches have demonstrated significant potential in improving MILPsolving efficiency. However, these methods typically rely on separate offline data collection and training processes, which limits their scalability and adaptability. This paper introduces the first multi-task learning framework for ML-guided MILP solving. The proposed framework provides MILP embeddings helpful in guiding MILP solving across solvers (e.g., Gurobi and SCIP) and across tasks (e.g., Branching and Solver configuration). Through extensive experiments on three widely used MILP benchmarks, we demonstrate that our multi-task learning model performs similarly to specialized models within the same distribution. Moreover, it significantly outperforms them in generalization across problem sizes and tasks. Keywords: Deep Learning Mixed Integer Linear Programming Multitask Learning Graph Neural Networks.

learning, milp, single-task, (13 more...)

arXiv.org Artificial Intelligence

2412.14409

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Projected Task-Specific Layers for Multi-Task Reinforcement Learning

Roberts, Josselin Somerville, Di, Julia

arXiv.org Artificial IntelligenceSep-15-2023

Multi-task reinforcement learning could enable robots to scale across a wide variety of manipulation tasks in homes and workplaces. However, generalizing from one task to another and mitigating negative task interference still remains a challenge. Addressing this challenge by successfully sharing information across tasks will depend on how well the structure underlying the tasks is captured. In this work, we introduce our new architecture, Projected Task-Specific Layers (PTSL), that leverages a common policy with dense task-specific corrections through task-specific layers to better express shared and variable task information. We then show that our model outperforms the state of the art on the MT10 and MT50 benchmarks of Meta-World consisting of 10 and 50 goal-conditioned tasks for a Sawyer arm.

architecture, projection, ptsl, (15 more...)

arXiv.org Artificial Intelligence

2309.08776

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.75)

Add feedback

QBERT: Generalist Model for Processing Questions

Xu, Zhaozhen, Cristianini, Nello

arXiv.org Artificial IntelligenceDec-4-2022

Using a single model across various tasks is beneficial for training and applying deep neural sequence models. We address the problem of developing generalist representations of text that can be used to perform a range of different tasks rather than being specialised to a single application. We focus on processing short questions and developing an embedding for these questions that is useful on a diverse set of problems, such as question topic classification, equivalent question recognition, and question answering. This paper introduces QBERT, a generalist model for processing questions. With QBERT, we demonstrate how we can train a multi-task network that performs all question-related tasks and has achieved similar performance compared to its corresponding single-task models.

classification, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2212.01967

Country:

Europe > United Kingdom (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

BERN2: an advanced neural biomedical named entity recognition and normalization tool

Sung, Mujeen, Jeong, Minbyul, Choi, Yonghwa, Kim, Donghyeon, Lee, Jinhyuk, Kang, Jaewoo

arXiv.org Artificial IntelligenceOct-6-2022

In biomedical natural language processing, named entity recognition (NER) and named entity normalization (NEN) are key tasks that enable the automatic extraction of biomedical entities (e.g. diseases and drugs) from the ever-growing biomedical literature. In this article, we present BERN2 (Advanced Biomedical Entity Recognition and Normalization), a tool that improves the previous neural network-based NER tool by employing a multi-task NER model and neural network-based NEN models to achieve much faster and more accurate inference. We hope that our tool can help annotate large-scale biomedical texts for various tasks such as biomedical knowledge graph construction.

artificial intelligence, bern2, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1093/bioinformatics/btac598

2201.0208

Country:

Asia > South Korea > Seoul > Seoul (0.06)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Switzerland > Geneva > Geneva (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding

Fei, Zhaoye, Tian, Yu, Wu, Yongkang, Zhang, Xinyu, Zhu, Yutao, Liu, Zheng, Wu, Jiawen, Kong, Dejiang, Lai, Ruofei, Cao, Zhao, Dou, Zhicheng, Qiu, Xipeng

arXiv.org Artificial IntelligenceAug-18-2022

Generalized text representations are the foundation of many natural language understanding tasks. To fully utilize the different corpus, it is inevitable that models need to understand the relevance among them. However, many methods ignore the relevance and adopt a single-channel model (a coarse paradigm) directly for all tasks, which lacks enough rationality and interpretation. In addition, some existing works learn downstream tasks by stitches skill block(a fine paradigm), which might cause irrationalresults due to its redundancy and noise. Inthis work, we first analyze the task correlation through three different perspectives, i.e., data property, manual design, and model-based relevance, based on which the similar tasks are grouped together. Then, we propose a hierarchical framework with a coarse-to-fine paradigm, with the bottom level shared to all the tasks, the mid-level divided to different groups, and the top-level assigned to each of the tasks. This allows our model to learn basic language properties from all tasks, boost performance on relevant tasks, and reduce the negative impact from irrelevant tasks. Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.

multi-task learning, relevance, task relevance, (15 more...)

arXiv.org Artificial Intelligence

2208.09129

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(9 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Understanding (0.81)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Filters

Collaborating Authors

task-specific layer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

c8067ad1937f728f51288b3eb986afaa-Paper.pdf

Sheaf-Based Decentralized Multimodal Learning for Next-Generation Wireless Communication Systems

Black-Box Privacy Attacks on Shared Representations in Multitask Learning

Unified Attention Modeling for Efficient Free-Viewing and Visual Search via Shared Representations

Tint Your Models Task-wise for Improved Multi-task Model Merging

Multi-task Representation Learning for Mixed Integer Linear Programming

Projected Task-Specific Layers for Multi-Task Reinforcement Learning

QBERT: Generalist Model for Processing Questions

BERN2: an advanced neural biomedical named entity recognition and normalization tool

Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding